GODRIVER-3605 Refactor StringN #2128

Merged (5 commits) on Aug 7, 2025

Conversation

qingyang-hu
Collaborator

@qingyang-hu qingyang-hu commented Jul 15, 2025

GODRIVER-3605
GODRIVER-3561

Summary

Refactor StringN

Background & Motivation

Previously, Element.StringN() did not truncate strings as expected.
This PR also adds a return flag to StringN() to indicate whether the string has been truncated.

@mongodb-drivers-pr-bot mongodb-drivers-pr-bot bot added the review-priority-low Low Priority PR for Review: within 3 business days label Jul 15, 2025
Contributor

mongodb-drivers-pr-bot bot commented Jul 15, 2025

API Change Report

./v2/x/bsonx/bsoncore

incompatible changes

Array.StringN: changed from func(int) string to func(int) (string, bool)
Document.StringN: changed from func(int) string to func(int) (string, bool)
Element.StringN: changed from func(int) string to func(int) (string, bool)
Value.StringN: changed from func(int) string to func(int) (string, bool)
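
Since every StringN method now returns two values, existing call sites need a small adjustment. The following is a minimal sketch of what an adapted caller might look like. The `doc` type and `logLine` helper are hypothetical stand-ins, not part of bsoncore, and the byte-wise slicing assumes ASCII content.

```go
package main

import "fmt"

// doc is a hypothetical stand-in for a bsoncore type; only the shape of
// the StringN method matters for this sketch.
type doc string

// StringN returns at most n bytes of the string form and reports whether
// anything was cut off. Byte-wise slicing is only safe for ASCII input.
func (d doc) StringN(n int) (string, bool) {
	if n < 0 {
		n = 0
	}
	if n >= len(d) {
		return string(d), false
	}
	return string(d[:n]), true
}

// logLine shows a caller adapted to the two-value signature: it appends
// an ellipsis only when the flag says the output was truncated.
func logLine(d doc, n int) string {
	s, truncated := d.StringN(n)
	if truncated {
		s += "..."
	}
	return s
}

func main() {
	fmt.Println(logLine(doc(`{"x": 1}`), 4))
	fmt.Println(logLine(doc(`{"x": 1}`), 100))
}
```

Callers that do not care about the flag can discard it with `s, _ := d.StringN(n)`.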


// If the last byte is not a closing bracket, then the document was truncated
if len(str) > 0 && str[len(str)-1] != '}' {
Collaborator Author

This line causes GODRIVER-3561
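
One way a last-byte check like this can misreport is sketched below with plain strings rather than bsoncore types. The `looksTruncated` helper is a hypothetical reproduction of the condition quoted above; whether this is the exact failure reported in GODRIVER-3561 is an assumption.

```go
package main

import "fmt"

// looksTruncated reproduces the quoted heuristic: a non-empty string is
// assumed to be truncated when it does not end in a closing brace.
func looksTruncated(str string) bool {
	return len(str) > 0 && str[len(str)-1] != '}'
}

func main() {
	full := `{"a": {"b": 1}}`
	cut := full[:len(full)-1] // `{"a": {"b": 1}` still ends in '}'

	fmt.Println(looksTruncated(full)) // false: the full document
	fmt.Println(looksTruncated(cut))  // false as well, although one byte was cut
}
```

Returning an explicit flag from the code that performs the truncation avoids inferring it from the output.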

@qingyang-hu qingyang-hu force-pushed the godriver3605 branch 3 times, most recently from acac4b7 to 620794a Compare July 23, 2025 22:48
@qingyang-hu qingyang-hu marked this pull request as ready for review July 23, 2025 23:18
@qingyang-hu qingyang-hu requested a review from a team as a code owner July 23, 2025 23:18
Member

@prestonvasquez prestonvasquez left a comment

Excellent work, Qingyang. Thank you for taking this on 👍


var buf strings.Builder
// stringN stringify a document. If N is larger than 0, it will truncate the string to N bytes.
Member

Can we clarify in the method documentation that the second return value indicates whether the stringified document was truncated?

-func (d Document) StringN(n int) string {
-	if len(d) < 5 || n <= 0 {
-		return ""
+func (d Document) StringN(n int) (string, bool) {
Member

Can we clarify in the method documentation that the second return value indicates whether the stringified document was truncated?

Collaborator

@matthewdale matthewdale Jul 29, 2025

We could use named return values.

E.g.

func (d Document) StringN(n int) (val string, truncated bool)

Edit: Note that named return values are only generally a good idea on short functions (<50 lines). Disregard this suggestion if StringN becomes longer than 50 lines.

Comment on lines 270 to 276
if n <= 0 {
	if l, _, ok := ReadLength(d); !ok || l < 5 {
		return "", false
	}
	return "", true
}
return d.stringN(n)
Member

It's interesting that calling StringN(0) will return an empty string but stringN(0) will return the entire document. The asymmetry makes the code fairly difficult to read. Should we update StringN so that an N <= 0 argument returns the full document?

func (d Document) StringN(n int) (string, bool) {
	if n <= 0 {
		return d.stringN(0)
	}

	return d.stringN(n)
}

Ultimately, it's unclear to me why one would call StringN(0) expecting an empty string.

Collaborator Author

@qingyang-hu qingyang-hu Jul 24, 2025

By using stringN(), we can keep the original behavior of StringN(n), which is to return an empty string when n <= 0, while also allowing it to be reused by String(). However, I will be happy to merge stringN() and StringN() if we all accept returning the entire document on StringN(0).

Member

I think that makes sense. Will let @matthewdale opine.

Collaborator

@matthewdale matthewdale Jul 29, 2025

I have a slight preference for -1 (i.e. all negative) vs 0 (i.e. all non-positive) because it more closely matches the semantics of string truncation (i.e. StringN(0) == String()[:0]), although there's not a clear use case for passing either -1 or 0.

Is there a reason we want to change the behavior from the original, where we pass math.MaxInt to get the full document?

Collaborator Author

Using -1 (i.e. all negative values) rather than 0 (i.e. all non-positive values) seems like the more sensible interface.

The change is only for better compatibility: callers no longer need to pass a particular value like math.MaxInt to get the entire document.
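
The semantics discussed in this thread can be sketched on plain strings as follows. This is an illustration under assumptions, not the merged implementation: `stringN` is a hypothetical helper, it slices by bytes (so it is only safe for ASCII input), and it uses named return values as suggested earlier in the review.

```go
package main

import "fmt"

// stringN returns at most n bytes of s and reports whether s was cut.
// A negative n means "no limit"; n == 0 yields the empty string, which
// matches the slice expression s[:0].
func stringN(s string, n int) (val string, truncated bool) {
	if n < 0 || n >= len(s) {
		return s, false
	}
	return s[:n], true
}

func main() {
	full := `{"x": 1}`
	for _, n := range []int{-1, 0, 3, len(full)} {
		val, truncated := stringN(full, n)
		fmt.Printf("n=%d val=%q truncated=%t\n", n, val, truncated)
	}
}
```

With this shape, callers never need a sentinel such as math.MaxInt to request the full output.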

if !ok || length < 5 {
return "", false
}
length -= (4 /* length bytes */ + 1 /* final null byte */)
Member

[nit] Can we re-write this to use // ?

length -= (4 + // length bytes
	1) // final null byte

truncatedStr := bsoncoreutil.Truncate(str, n-buf.Len())
buf.WriteString(truncatedStr)

for length > 0 && !truncated {
Member

[nit] The loop logic is hard to follow, suggest adding comments. I think the following are sufficient:

  1. Determine the remaining budget: l := n - buf.Len()
  2. If buf.Len() >= n, mark truncated and break
  3. If !first, write a comma, decrement l, and possibly break
  4. Call ReadElement, subtract the element's full length, and exit on !ok
  5. Delegate to elem.stringN(l), write its string, and record its truncated flag

l = n - buf.Len()
}
if !first {
buf.WriteByte(',')
Member

Are we sure we should be writing the comma here? Without testing directly it seems unintuitive. What if n=1 or something and there isn't enough room for a comma?

Collaborator Author

Another issue is that we don't need a comma if the document only contains one element, even if there is enough space.

for _, tc := range testCases {
for n := -1; n <= len(tc.want)+1; n++ {
t.Run(fmt.Sprintf("StringN %s n==%d", tc.description, n), func(t *testing.T) {
got, _ := tc.doc.StringN(n)
Member

We should validate the expected values for the truncation bool.
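
A check of both return values over the same range of n might look like the sketch below. It assumes the "negative n means no limit" semantics discussed elsewhere in this review, and uses hypothetical helpers (`refStringN`, `checkAll`) on ASCII strings rather than the real bsoncore types or the testing package.

```go
package main

import "fmt"

// refStringN is a hypothetical reference implementation: negative n means
// no limit, otherwise at most n bytes are returned.
func refStringN(s string, n int) (string, bool) {
	if n < 0 || n >= len(s) {
		return s, false
	}
	return s[:n], true
}

// checkAll runs f for every n from -1 to len(full)+1 and verifies both
// the returned string and the truncation flag.
func checkAll(full string, f func(string, int) (string, bool)) error {
	for n := -1; n <= len(full)+1; n++ {
		got, truncated := f(full, n)
		want, wantTruncated := full, false
		if n >= 0 && n < len(full) {
			want, wantTruncated = full[:n], true
		}
		if got != want || truncated != wantTruncated {
			return fmt.Errorf("n=%d: got (%q, %t), want (%q, %t)",
				n, got, truncated, want, wantTruncated)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkAll(`{"x": 1}`, refStringN)) // <nil>
}
```

Discarding the flag with `got, _ :=` would let an implementation that always returns false pass unnoticed.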

-func (e Element) StringN(n int) string {
+func (e Element) StringN(n int) (string, bool) {
if n <= 0 {
return "", len(e) > 0
Member

Consider treating n<=0 as "no limit", see comment on document.StringN().

if n <= 0 {
-	return ""
+	return "", len(str) > 0
Member

Consider treating n<=0 as "no limit", see comment on document.StringN().

if lens, _, _ := ReadLength(a); lens < 5 || n <= 0 {
return ""
func (a Array) StringN(n int) (string, bool) {
if n <= 0 {
Member

Consider treating n<=0 as "no limit", see comment on document.StringN().

@qingyang-hu qingyang-hu requested review from matthewdale and removed request for matthewdale July 28, 2025 17:26

for _, tc := range testCases {
for n := -1; n <= len(tc.want)+1; n++ {
t.Run(fmt.Sprintf("StringN %s n==%d", tc.description, n), func(t *testing.T) {
Collaborator

Optional: Consider factoring these into two test functions TestArray_String and TestArray_StringN and moving the test cases to a package-level variable (e.g. var arrayStringTestCases = ...).


for _, tc := range testCases {
for n := -1; n <= len(tc.want)+1; n++ {
t.Run(fmt.Sprintf("StringN %s n==%d", tc.description, n), func(t *testing.T) {
Collaborator

Optional: Consider factoring these into two test functions TestDocument_String and TestDocument_StringN and moving the test cases to a package-level variable (e.g. var documentStringTestCases = ...).

t.Run(tc.description, func(t *testing.T) {
got := tc.val.StringN(tc.n)
for n := -1; n <= len(tc.want)+1; n++ {
t.Run(fmt.Sprintf("StringN %s n==%d", tc.description, n), func(t *testing.T) {
Collaborator

Optional: Consider factoring these into two test functions TestValue_String and TestValue_StringN and moving the test cases to a package-level variable (e.g. var valueStringTestCases = ...).

{15, `"𨉟呐㗂越"`},
{21, `"𨉟呐㗂越"`},
} {
t.Run(fmt.Sprintf("StringN multi-byte string n==%d", tc.n), func(t *testing.T) {
Collaborator

Optional: Consider factoring this into a separate test function TestArray_StringN_Multibyte.

buf.WriteByte(']')
}

-return buf.String()
+return buf.String(), truncated
Collaborator

Do we need to truncate the string in the buffer?

E.g.

bsoncoreutil.Truncate(buf.String(), n)

Collaborator Author

It's not technically necessary, since the buffer's contents have already been truncated. However, an extra truncation sounds like a good idea as an additional safeguard.

buf.WriteByte('}')
}

-return buf.String()
+return buf.String(), truncated
Collaborator

Do we need to truncate the string in the buffer?

E.g.

bsoncoreutil.Truncate(buf.String(), n)

Collaborator Author

It's not technically necessary, since the buffer's contents have already been truncated. However, an extra truncation sounds like a good idea as an additional safeguard.

Collaborator

@matthewdale matthewdale left a comment

Looks good! 👍

Member

@prestonvasquez prestonvasquez left a comment

Thanks!

@qingyang-hu qingyang-hu merged commit a709a1d into mongodb:master Aug 7, 2025
34 of 37 checks passed
qingyang-hu added a commit that referenced this pull request Aug 7, 2025
qingyang-hu added a commit that referenced this pull request Aug 7, 2025
Labels
bug review-priority-low Low Priority PR for Review: within 3 business days
3 participants